Demo: Add feature for Spark ORC writer to not persist field ids in files, using a new table property #133

Open
rzhang10 wants to merge 3 commits into linkedin:li-0.11.x from rzhang10:spark_orc_do_not_persist_field_ids
Conversation

@rzhang10
Member

Adds a new table property, "write.orc.no-field-ids.enabled", that controls whether the Spark ORC writer persists field ids in the written file schema. When the property is enabled, the writer does not persist them. This feature will be useful for Gobblin when ingesting custom Hive/Iceberg hybrid tables that share underlying files.
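
To illustrate how a writer might consult the property, here is a minimal sketch. It is not the code in this PR: the class `OrcFieldIdConfig`, the method `shouldPersistFieldIds`, and the default of `false` (field ids are still written when the property is absent) are all assumptions made for the example. Only the property name comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of this PR or of Iceberg.
final class OrcFieldIdConfig {
  // Property name as given in the PR description.
  static final String NO_FIELD_IDS_ENABLED = "write.orc.no-field-ids.enabled";

  // Assumed default: keep persisting field ids unless the property is set to true.
  static final boolean NO_FIELD_IDS_ENABLED_DEFAULT = false;

  private OrcFieldIdConfig() {}

  // Returns true when the writer should persist field ids in the ORC file schema.
  static boolean shouldPersistFieldIds(Map<String, String> tableProperties) {
    String value = tableProperties.get(NO_FIELD_IDS_ENABLED);
    boolean noFieldIds =
        value == null ? NO_FIELD_IDS_ENABLED_DEFAULT : Boolean.parseBoolean(value);
    return !noFieldIds;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println("property unset: persist = " + shouldPersistFieldIds(props));

    props.put(NO_FIELD_IDS_ENABLED, "true");
    System.out.println("property true:  persist = " + shouldPersistFieldIds(props));
  }
}
```

Under these assumptions, existing tables keep their current behavior, and only tables that explicitly set the property to `true` get files without field ids.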

@rzhang10 force-pushed the spark_orc_do_not_persist_field_ids branch from c3c4a87 to 44d2181 on December 12, 2022 23:33